Complex Annotations with NooJ

Author

  • Max Silberztein
Abstract

NooJ associates each text with a Text Annotation Structure, in which each recognized linguistic unit is represented by an annotation. Annotations store the position of the text units to be represented, their length, and linguistic information. NooJ can represent and process complex annotations, such as those that represent units inside word forms, as well as those that are discontinuous. We demonstrate how to use NooJ’s morphological, lexical, and syntactic tools to formalize and process these complex annotations.
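To make the idea of an annotation that stores a position, a length, and linguistic information more concrete, here is a minimal Python sketch. It is not NooJ's implementation or API: the names `Span`, `Annotation`, and `span_of`, the sample sentence, and the `info` labels are all hypothetical and chosen only for illustration. It assumes that a unit inside a word form can be modeled as a span shorter than the word, and a discontinuous unit as an annotation with more than one span.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Span:
    start: int   # character offset of the first character of the unit
    length: int  # number of characters covered

    @property
    def end(self) -> int:
        return self.start + self.length


@dataclass(frozen=True)
class Annotation:
    spans: Tuple[Span, ...]  # one span if contiguous, several if discontinuous
    info: str                # linguistic information (illustrative label only)

    def is_discontinuous(self) -> bool:
        return len(self.spans) > 1

    def text(self, source: str) -> str:
        # Join the covered pieces with a space so discontinuous units stay readable.
        return " ".join(source[s.start:s.end] for s in self.spans)


def span_of(source: str, piece: str) -> Span:
    """Hypothetical helper: span of the first occurrence of `piece` in `source`."""
    return Span(source.index(piece), len(piece))


TEXT = "She cannot take it into account"  # illustrative sentence, not from the paper

# Two units inside the single word form "cannot".
word = span_of(TEXT, "cannot")
can = Annotation((Span(word.start, 3),), "lemma=can")
neg = Annotation((Span(word.start + 3, 3),), "lemma=not")

# One discontinuous unit: "take ... into account", interrupted by "it".
phrasal = Annotation(
    (span_of(TEXT, "take"), span_of(TEXT, "into account")),
    "lemma=take into account",
)

print(can.text(TEXT), "|", neg.text(TEXT), "|", phrasal.text(TEXT))
```

Representing every annotation as a tuple of spans keeps the contiguous and discontinuous cases uniform; a real annotation structure may well store these differently.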

Related resources

Syntactic parsing with NooJ

When parsing a text, NooJ’s parsers store all the annotations that they produce in the Text’s Annotation Structure (TAS). At each level of the various linguistic analyses and the corresponding parser, a given parser may add annotations to, or remove annotations from, the TAS. As annotations are attached to larger and larger sequences of texts, the TAS represents the hierarchical structure of th...

NooJ: a Linguistic Annotation System for Corpus Processing

One characteristic of NooJ is that its corpus processing engine uses large-coverage linguistic lexical and syntactic resources. This allows NooJ users to perform sophisticated queries that include any of the available morphological, lexical or syntactic properties. In comparison with INTEX, NooJ uses a new technology (.NET), a new linguistic engine, and was designed with a new range of applicat...

Formalisation de l'amazighe standard avec NooJ (Formalization of the standard Amazigh with NooJ) [in French]

With this in mind, and with the aim of developing linguistic tools and resources, we set out to build a NooJ module for the standard Amazigh language (Ameur et al., 2004). This article proposes a formalization of the noun category that makes it possible to generate, from a lexical entry, its gender (masculine, feminine), its number (singular, plural), and its state (...

Morphological study of Albanian words, and processing with NooJ

We are developing electronic dictionaries and transducers for the automatic processing of the Albanian Language. We will analyze the words inside a linear segment of text. We will also study the relationship between units of sense and units of form. The composition of words takes different forms in Albanian. We have found that morphemes are frequently concatenated or simply juxtaposed or contra...

Publication date: 2010